MTIL17: English to Indian Language Statistical Machine Translation

Authors

  • Raj Nath Patel
  • Prakash B. Pimpale
  • Sasikumar M
Abstract

English to Indian language machine translation poses the challenge of structural and morphological divergence. This paper describes English to Indian language statistical machine translation using pre-ordering and suffix separation. The pre-ordering uses rules to transfer the structure of the source sentences prior to training and translation. This syntactic restructuring helps statistical machine translation handle the structural divergence and hence yields better translation quality. The suffix separation is used to tackle the morphological divergence between English and highly agglutinative Indian languages. We demonstrate that the use of pre-ordering and suffix separation helps in improving the quality of English to Indian language machine translation.
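The suffix separation idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the suffix list, the `+` marker, the minimum stem length, and the function names are all assumptions made for illustration, and the romanized suffixes are toy values rather than a real linguistic resource. The sketch splits a known suffix off the end of each target-side token before training, and rejoins the pieces after decoding.

```python
# Toy suffix list (hypothetical; a real system would use a curated,
# language-specific list in the native script).
SUFFIXES = ("kalil", "kal", "il")
MIN_STEM = 2  # assumed minimum stem length, to avoid splitting very short tokens


def separate_suffix(token, suffixes=SUFFIXES, marker="+"):
    """Split one token into [stem, marker+suffix] using longest-match-first."""
    for suf in sorted(suffixes, key=len, reverse=True):
        if token.endswith(suf) and len(token) - len(suf) >= MIN_STEM:
            return [token[: -len(suf)], marker + suf]
    return [token]  # no known suffix, or stem too short: leave unchanged


def separate_sentence(sentence):
    """Apply suffix separation to every whitespace-delimited token."""
    out = []
    for tok in sentence.split():
        out.extend(separate_suffix(tok))
    return " ".join(out)


def rejoin(sentence, marker="+"):
    """Undo the separation after translation by gluing marked suffixes back."""
    out = []
    for tok in sentence.split():
        if tok.startswith(marker) and out:
            out[-1] += tok[len(marker):]
        else:
            out.append(tok)
    return " ".join(out)


if __name__ == "__main__":
    split = separate_sentence("veedukal il")
    print(split)
    print(rejoin(split))
```

Separating suffixes this way shortens the target vocabulary the translation model has to learn, at the cost of a post-processing step to restore full word forms.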

Similar resources

NLP Challenges for Machine Translation from English to Indian Languages

This natural language processing work is carried out particularly on English-Kannada/Telugu. Kannada is a language of India. The Kannada language has a classification of Dravidian, Southern, Tamil-Kannada, and Kannada. Regions Spoken: Kannada is also spoken in Karnataka, Andhra Pradesh, Tamil Nadu, and Maharashtra. Population: The total population of people who speak Kannada is 35,346,000, as of 1997. A...

An Empirical Survey on Automatic Machine Translation between English and Indian Languages

In this paper, we have reported our survey on systems and projects that intend to translate between English and Indian languages. Most of the translators and projects aim to translate from English to more than one Indian language. The main challenge is due to the fact that Indian languages are quite different from European languages. In this paper, we have explored the following ...

Statistical Vs Rule Based Machine Translation; A Case Study on Indian Language Perspective

In this paper we present our work on a case study between Statistical Machine Translation (SMT) and Rule-Based Machine Translation (RBMT) systems from an English-Indian language and Indian to Indian language perspective. The main objective of our study is to make a five-way performance comparison: a) SMT and RBMT b) SMT on English-Indian language c) RBMT on English-Indian language d) SMT on Ind...

A Hybrid Approach to Example based Machine Translation for Indian Languages

Corpus-based approaches to machine translation, namely Example-based machine translation and Statistical machine translation, have received wide attention in recent years. Hybrid approaches combining the two further improved the performance. Indian language machine translation has mostly focussed on rule-based machine translation. We propose a hybrid approach to Example-based machine translation ...

Statistical Machine Translation for Indian Languages: Mission Hindi 2

This paper presents Centre for Development of Advanced Computing Mumbai’s (CDACM) submission to NLP Tools Contest on Statistical Machine Translation in Indian Languages (ILSMT) 2015 (collocated with ICON 2015). The aim of the contest was to collectively explore the effectiveness of Statistical Machine Translation (SMT) while translating within Indian languages and between English and Indian lan...

Journal:
  • CoRR

Volume: abs/1708.07950

Publication date: 2017